How to Choose Between isdigit(), isdecimal() and isnumeric() in Python
In this post, you'll learn the subtle differences between str.isdigit, str.isdecimal, and str.isnumeric in Python 3 and how to choose the best one for the job.

When processing strings, usually by reading them from some source, you might want to check whether a given string is a number. The string class (str) comes with three methods that you can use for that purpose. Each of them has pros and cons, and knowing the difference between them will save you a lot of development and debugging time.

In this article, you will:

- learn what str.isdigit(), str.isdecimal(), and str.isnumeric() do, their limitations, how to use them, and when you should use them
- understand the difference between isdigit, isnumeric, and isdecimal
- understand why isdigit, isnumeric, or isdecimal is not working for you
- learn how to solve common problems that cannot be easily solved with them, such as:
  - how to check that a float number string is made of digits
  - how to use isdigit, isnumeric, or isdecimal with negative numbers

Table of Contents

1. How isdigit() Works and When to Use It
2. How isdecimal() Works and When to Use It
3. How isnumeric() Works and When to Use It
4. Solving Common Problems
   4.1. How to Check if Float Numbers Are Digits?
   4.2. How to Check if Negative Numbers Are Digits?
   4.3. Why isdigit Is Not Working for Me?
5. Conclusion

How isdigit() Works and When to Use It

str.isdigit() is the most obvious choice if you want to determine whether a string, or a single character, is a digit in Python. According to its documentation, this method returns True if all characters in the string are digits and the string is not empty; otherwise it returns False. Let's see some examples:

    # all characters in the string are digits
    >>> '102030'.isdigit()
    True
    # 'a' is not a digit
    >>> '102030a'.isdigit()
    False
    # isdigit fails if there's whitespace
    >>> ' 102030'.isdigit()
    False
    # it must be at least one char long
    >>> ''.isdigit()
    False
    # a dot '.' is not a digit either
    >>> '12.5'.isdigit()
    False

Contrary to what many people think, isdigit is not a function but a method of the str, bytes, and bytearray classes. This works well for these simpler cases, but what happens if a string has a space?

    # ' ' (space) is not a digit
    >>> ' 102030'.isdigit()
    False

This fails because the string starts with a space. As a result, we cannot use isdigit as is on unprocessed input, such as the value returned by the input() function. You must always remember to preprocess the input before checking it with isdigit(). That might be one of the reasons isdigit is not working for you.

Despite this strict behavior, isdigit has some gotchas. If we read the documentation carefully, it says that the method can also handle "superscript digits":

    "Digits include decimal characters and digits that need special handling, such as the compatibility superscript digits."

But how does that work? Will it return True for strings with superscripts such as 2⁷?

    >>> d = '2' + '\u2077'
    >>> d
    '2⁷'
    >>> d.isdigit()
    True
    # it accepts a superscript on its own
    >>> '⁵'.isdigit()
    True
    # and a superscript followed by a regular digit
    >>> '⁵5'.isdigit()
    True

It turns out it does! You can actually use it with input():

    >>> a = input('Enter a number:')
    Enter a number:2⁷
    >>> a
    '2⁷'
    >>> a.isdigit()
    True

Even though it works well with superscripts, it doesn't accept fraction characters. This method is really about single digits.

    # fractions in Unicode are not digits
    >>> '⅕'.isdigit()
    False

As we can see, str.isdigit() works really well with Unicode characters. If we take a look at the unit test suite for this method, we can see some interesting test cases.

    # https://github.com/python/cpython/blob/3.10/Lib/test/test_unicode.py#L704
    def test_isdigit(self):
        super().test_isdigit()
        self.checkequalnofix(True, '\u2460', 'isdigit')
        self.checkequalnofix(False, '\xbc', 'isdigit')
        self.checkequalnofix(True, '\u0660', 'isdigit')

        for ch in ['\U00010401', '\U00010427', '\U00010429', '\U0001044E',
                   '\U0001F40D', '\U0001F46F', '\U00011065']:
            self.assertFalse(ch.isdigit(), '{!a} is not a digit.'.format(ch))
        for ch in ['\U0001D7F6', '\U00011066', '\U000104A0', '\U0001F107']:
            self.assertTrue(ch.isdigit(), '{!a} is a digit.'.format(ch))

The images below show some of these test cases.

[Image: str.isdigit() works really well with numeric Unicode]

[Image: Unicode characters that don't represent digits are not accepted]

Summary of What isdigit Cannot Do

- Can it handle whitespace? No
- Can it handle hexadecimal? No
- Does it raise an exception? No
- Does it accept negative digits (with a minus sign)? No

When to Use it?

Use str.isdigit when you want to verify that every character in a string is a single digit, that is, not punctuation, not a letter, and not a minus sign.

How isdecimal() Works and When to Use It

The str.isdecimal() method is very similar: it returns True if all characters are decimal characters and the string is not empty. Superscripts are NOT decimal characters, so isdecimal returns False for them.

    >>> '5'.isdecimal()
    True
    >>> '⁵'.isdecimal()
    False
    >>> '5⁵'.isdecimal()
    False
    >>> '-4'.isdecimal()
    False
    >>> '4.5'.isdecimal()
    False

[Image: Superscripts are not decimal numbers]

isdecimal also accepts Unicode characters that are used to form base 10 numbers in other writing systems. For example, the Arabic-Indic digit zero is a decimal character, so '٠'.isdecimal() returns True.

    >>> '٠'.isdecimal()
    True

[Image: Arabic-Indic digits such as '٠' are decimal characters in base 10]

When to Use it?

Use str.isdecimal when you want to verify that every character in a string can form a base 10 number. Since punctuation, superscripts, letters, and the minus sign are not decimal characters, isdecimal returns False for any string that contains them.
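
To make the digit vs decimal distinction more concrete, here is a minimal sketch that sorts a few characters into "decimal", "digit", or "neither". The classify helper is a hypothetical name introduced only for this illustration; unicodedata.digit is part of the standard library. The sketch also shows one practical consequence: int() converts strings of decimal characters, but it rejects digits that are not decimal, such as superscripts.

```python
import unicodedata


def classify(ch: str) -> str:
    """Hypothetical helper: label a single character by the strictest check it passes."""
    if ch.isdecimal():
        return "decimal"
    if ch.isdigit():
        return "digit"
    return "neither"


# '5', Arabic-Indic zero, superscript five, circled digit one, a letter, a minus sign
samples = ["5", "\u0660", "\u2075", "\u2460", "a", "-"]

for ch in samples:
    # unicodedata.digit returns the given default (None) when ch has no digit value
    print(repr(ch), classify(ch), unicodedata.digit(ch, None))

# int() accepts decimal characters from any script ...
arabic_indic_ten = int("\u0661\u0660")

# ... but not digits that are not decimal, such as a superscript five
try:
    int("\u2075")
    superscript_converts = True
except ValueError:
    superscript_converts = False
```

If the goal is to guard a later call to int(), isdecimal is therefore the closer match of the two, as long as the string has no sign and no surrounding whitespace.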

How isnumeric() Works and When to Use It

This one overlaps significantly with isdigit and isdecimal. According to the documentation, isnumeric returns True if all characters in the string are numeric characters and the string is not empty. The key difference here is the word numeric. What is the difference between a numeric character and a digit character?

According to the documentation, a digit is a character with the Unicode property Numeric_Type=Digit or Numeric_Type=Decimal, whereas a numeric character is one with Numeric_Type=Digit, Numeric_Type=Decimal, or Numeric_Type=Numeric. In other words, a numeric character is any Unicode symbol that represents a numeric value, and that includes fractions! Not only that, isnumeric also accepts Unicode Roman numeral characters such as 'Ⅷ' (U+2167), though not Roman numerals spelled with plain letters such as 'VIII'. Let's see some examples in action.

    >>> '⅕'.isdigit()
    False
    >>> '⅕'.isnumeric()
    True
    >>> '⁵'.isnumeric()
    True
    >>> '5⁵'.isnumeric()
    True

When to Use it?

Use str.isnumeric when you want to verify that every character in a string is a valid numeric character, including fractions, superscripts, and Roman numeral characters. Since punctuation, letters, and the minus sign are not numeric characters, strings that contain them evaluate to False.

Solving Common Problems

In this section, we'll see how to fix the most common problems when using isdigit, isnumeric, and isdecimal.

How to Check if Float Numbers Are Digits?

The best way to check that is to try to convert the string to float. If the float constructor doesn't raise an exception, then the string is a valid float. This is a Pythonic idiom called EAFP (easier to ask for forgiveness than permission). Keep in mind that float() also accepts strings such as '1e3', 'inf', and 'nan', as well as surrounding whitespace.

    def is_float_digit(n: str) -> bool:
        try:
            float(n)
            return True
        except ValueError:
            return False

    >>> is_float_digit('23.45')
    True
    >>> is_float_digit('23.45a')
    False

CAUTION: This approach does not work with superscripts! One way to handle them is to remove the first '.' and then call isdigit() on the result.

    >>> is_float_digit('23.45⁵')
    False

    def is_float_digit_v2(n: str) -> bool:
        # remove at most one dot, then require every remaining character to be a digit
        return n.replace('.', '', 1).isdigit()

    >>> is_float_digit_v2('23.45⁵')
    True

How to Check if Negative Numbers Are Digits?

Checking numbers that start with a minus sign depends on the target type.
Since we're talking about digits here, it makes sense to check whether the string can be converted to int. This is very similar to the EAFP approach discussed for floats. However, just like the previous approach, it doesn't handle superscripts.

    def is_negative_number_digit(n: str) -> bool:
        try:
            int(n)
            return True
        except ValueError:
            return False

    >>> is_negative_number_digit('-2345')
    True
    >>> is_negative_number_digit('-2345⁵')
    False

To fix that, strip a single leading minus sign before calling isdigit().

    def is_negative_number_digit_v2(n: str) -> bool:
        # drop at most one leading minus sign; lstrip('-') would also accept '--5'
        digits = n[1:] if n.startswith('-') else n
        return digits.isdigit()

    >>> is_negative_number_digit_v2('-2345')
    True
    >>> is_negative_number_digit_v2('-2345⁵')
    True

Why isdigit Is Not Working for Me?

The most common issue that prevents isdigit, isnumeric, and isdecimal from working properly is leading or trailing whitespace in the string. Before using them, it's imperative to remove any leading or trailing whitespace, including characters such as newline (\n).

    >>> ' 54'.isdigit()
    False
    >>> ' 54'.strip().isdigit()
    True
    >>> ' 54'.isnumeric()
    False
    >>> ' 54'.strip().isnumeric()
    True
    >>> ' 54'.isdecimal()
    False
    >>> ' 54'.strip().isdecimal()
    True
    >>> '65\n'.isdigit()
    False
    >>> '65\n'.strip().isdigit()
    True

Conclusion

In this post, we saw the subtle differences between isdigit, isdecimal, and isnumeric and how to choose the most appropriate string method for your use case. We also saw cases that these methods cannot handle on their own and how to overcome those limitations.

That's it for today and I hope you've enjoyed this post!

References:

- docs.python.org/3/library/stdtypes.html
- fileformat.info/info/unicode/char/0660/brow..
- stackoverflow.com/a/7643705
- stackoverflow.com/a/28279773